AD-A216  748 




ANALYSIS  OF  FIRST-TERM  ATTRITION  OF 
NON-PRIOR  SERVICE  HIGH-QUALITY 
U.S.  ARMY  MALE  RECRUITS 


DTIC 


by 

HOA  GENERAZIO 

B.S.,  United  States  Military  Academy 
(1977) 


SUBMITTED  IN  PARTIAL  FULFILLMENT  OF  THE 
REQUIREMENTS  FOR  THE  DEGREE  OF 

MASTER  OF  SCIENCE  IN  OPERATIONS  RESEARCH 

at  the 

MASSACHUSETTS  INSTITUTE  OF  TECHNOLOGY 
December  1989 

©  Hoa  Generazio,  1989,  All  rights  reserved 


The  author  hereby  grants  to  MIT  permission  to  reproduce  and  to 
distribute  copies  of  this  thesis  document  in  whole  or  in  part. 


DISTRIBUTION STATEMENT A

Approved for public release;
Distribution Unlimited


Signature  of  Author. 




Operations  Research  Center 
December  1989 


Certified by

Arnold I. Barnett
Professor of Operations Research and Management
Thesis Advisor


Accepted by

Amedeo R. Odoni
Professor of Aeronautics and Astronautics and of Civil Engineering
Co-director, Operations Research Center




REPORT  DOCUMENTATION  PAGE 


Form  Approved 
OMB  No.  0704-0188 




1. AGENCY USE ONLY (Leave blank)

2. REPORT DATE: 13 December 1989

3. REPORT TYPE AND DATES COVERED: Final

4. TITLE AND SUBTITLE

Analysis of First-Term Attrition of Non-Prior Service High-Quality
U.S. Army Male Recruits

6. AUTHOR(S)

MAJ Hoa Generazio

7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES)

8. PERFORMING ORGANIZATION REPORT NUMBER

9. SPONSORING/MONITORING AGENCY NAME(S) AND ADDRESS(ES)
HQDA, MILPERCEN (DAPC-OPB-D)

10. SPONSORING/MONITORING AGENCY REPORT NUMBER

12a. DISTRIBUTION/AVAILABILITY STATEMENT

12b. DISTRIBUTION CODE


13. ABSTRACT (Maximum 200 words)

An analysis is performed to estimate an individual's probability of attrition in terms of
certain of his characteristics at time of enlistment. The main technique is logistic
regression modelling, which is applied to data pertaining to the high-quality male
population of the U.S. Army FY84 NPS accession cohort (high school graduates who scored
50% or higher on the Armed Forces Qualification Test (AFQT)). The results showed that
four characteristics are significant: age, level of education, aptitude test score, and
entry status (with or without a waiver). Age and entry status were positively correlated
with the rate of attrition. Conversely, education and aptitude test score were negatively
correlated. The older the recruit, the more likely the person is to drop out. The recruit
also is in a higher risk category for attrition if he or she entered the Army with a
waiver. The better educated the recruit is, the less likely the person will drop out. The
higher the aptitude test score, the more likely that the recruit will remain for the
entire obligated tour.


14.  SUBJECT  TERMS 

First  Term  Attrition;  Non-Prior  Service;  FY84  Accession  Cohort 
Significant  Characteristics;  Logistic  Regression;  Maximum 
Likelihood  Estimation 


15. NUMBER OF PAGES: 50

16. PRICE CODE

17. SECURITY CLASSIFICATION OF REPORT: UNCLASS

18. SECURITY CLASSIFICATION OF THIS PAGE: UNCLASS

19. SECURITY CLASSIFICATION OF ABSTRACT: UNCLASS

20. LIMITATION OF ABSTRACT

7540-01-280-5500
Standard Form 298, 880922 Draft
Prescribed by ANSI Std. 239-18

ANALYSIS  OF  FIRST  TERM  ATTRITION  OF 
NON-PRIOR  SERVICE  HIGH-QUALITY 
U.S.  ARMY  MALE  RECRUITS 

by 

HOA  GENERAZIO 

Submitted  to  the  Operations  Research  Center  on  December  13,  1989 
in  partial  fulfillment  of  the  requirements  for 
the  degree  of  Master  of  Science  in  Operations  Research 

ABSTRACT 

The United States Army enlists an average of 120,000 recruits (90% male) each year
to sustain its military force. Unfortunately, an average of 30% of each non-prior
service (NPS) accession cohort (year group) will, voluntarily or involuntarily, depart
the Army prior to completing their initial tour length obligation. For military readiness
and fiscal reasons, reducing attrition is a major concern of Army leaders.

An analysis is performed to estimate an individual's probability of attrition in terms of
certain of his characteristics at time of enlistment. The main analytic technique is
logistic regression modelling, which is applied to data pertaining to the high-quality
male population of the U.S. Army FY84 NPS accession cohort (high school graduates
who scored 50% or higher on the Armed Forces Qualification Test (AFQT)).

The results of the analysis showed that four characteristics are significant: age, level of
education, aptitude test score, and entry status (with or without a waiver).

Age and entry status were positively correlated with the rate of attrition. Conversely,
education and aptitude test score were negatively correlated. The older the recruit,
the more likely the person is to drop out. The recruit also is in a higher risk category
for attrition if he or she entered the Army with a waiver. The better educated the
recruit is, the less likely the person will drop out. The higher the aptitude test score,
the more likely that the recruit will remain for the entire obligated tour.


Thesis  Supervisor:  Professor  Arnold  I.  Barnett 

Title:  Professor  of  Operations  Research  and  Management 




ACKNOWLEDGEMENTS 


I want to thank my wife, Susan, and my three children, Christopher, Andrew
and Alaina, for their love and support.

I want to thank my academic advisor and thesis supervisor, Arnie Barnett.
His patience, support and constructive comments guided my work from beginning to
end.

I want to thank the Department of Mathematics, United States Military
Academy, for giving me the opportunity to pursue a master's degree in Operations
Research.

I want to thank Ilze S. Levis for her skillful and critical reviews of my
numerous drafts for technical style, logical organization and overall quality.




TABLE  OF  CONTENTS 


TITLE PAGE

ABSTRACT

ACKNOWLEDGEMENTS

TABLE OF CONTENTS

LIST OF TABLES

LIST OF FIGURES

I. INTRODUCTION

1.1 Background

1.2 Approach

1.3 Organization of Thesis

II. ANALYSIS METHODS AND DATA BASE

2.1 Selection of Variables

2.2 Chi-Square Test

2.3 Logistic Regression (Logit)

2.4 Maximum Likelihood Estimation (MLE)

2.5 Data Base

III. ATTRITION ANALYSIS

3.1 Demographic Profile - General

3.2 Demographic Profile - High-Quality Males

3.3 Demographic Profile - High-Quality Males - Sample Group

3.3.1 Age

3.3.2 Education Level

3.3.3 Aptitude Category

3.3.4 Waiver Status

3.4 Model Analysis

3.4.1 Chi-Square Test Results

3.4.2 Logit and MLE Results

3.5 Model Validation

IV. CONCLUSIONS

4.1 General

4.2 Factors Contributing to Attrition

4.3 Policy Implications and Future Research

4.3.1 Policy Implications

4.3.2 Future Research

Appendix A Waiver Category

BIBLIOGRAPHY




LIST  OF  TABLES 


1. Assignment and Definitions of Variables

2. Parameter Values from Logit Regression

3. Expected vs Observed Rates - Sample Group

4. Expected vs Observed Rates - Hold-out Sample

5. Expected vs Observed Frequencies - Hold-out Sample




LIST  OF  FIGURES 


1. Logistic Response Function

3.1 FY84 NPS Accession Cohort - Sex

3.2 FY84 NPS Accession Cohort - High Quality vs Low Quality

3.3 FY84 NPS Accession Cohort - Male - High vs Low Quality

3.4 FY84 NPS Accession Cohort - Female - High vs Low Quality

3.5 FY84 NPS Accession Cohort - High Quality - Male vs Female

3.6 FY84 NPS Accession Cohort - Attrition Rates - High Quality Male vs Total Population

3.7 FY84 NPS Accession Cohort - Sample Group - by Age - Total Numbers vs Numbers of Attrition

3.8 FY84 NPS Accession Cohort - Sample Group - by Age - Proportion of Attrition

3.9 FY84 NPS Accession Cohort - Sample Group - by Age Group - Total Numbers vs Numbers of Attrition

3.10 FY84 NPS Accession Cohort - Sample Group - by Age Group - Proportion of Attrition

3.11 FY84 NPS Accession Cohort - Sample Group - by Education Level - Total Numbers vs Numbers of Attrition

3.12 FY84 NPS Accession Cohort - Sample Group - by Education Level - Proportion of Attrition

3.13 FY84 NPS Accession Cohort - Sample Group - by Education Level Group - Total Numbers vs Numbers of Attrition

3.14 FY84 NPS Accession Cohort - Sample Group - by Education Level Group - Proportion of Attrition

3.15 FY84 NPS Accession Cohort - Sample Group - by Aptitude Category - Total Numbers vs Numbers of Attrition

3.16 FY84 NPS Accession Cohort - Sample Group - by Aptitude Category - Proportion of Attrition

3.17 FY84 NPS Accession Cohort - Sample Group - by Entry Status - Total Numbers vs Numbers of Attrition

3.18 FY84 NPS Accession Cohort - Sample Group - by Entry Status - Proportion of Attrition

3.19 Residual Plot of Expected Attrition Rates vs Observed Attrition Rates

3.20 Expected Frequencies vs Observed Frequencies


Chapter  1 


INTRODUCTION 


1.1  BACKGROUND 

The United States Army enlists an average of 120,000 non-prior service recruits
(90% male) each year to sustain its military force. These enlistees receive training
to qualify them for a military occupational specialty (MOS). Depending upon the
particular MOS, the training can last from four months to a year. Unfortunately, an
average of 30% of each non-prior service (NPS) accession cohort (year group) will,
voluntarily or involuntarily, depart the Army prior to completing their initial tour length
obligation.

First-term attrition is very costly to the U.S. Army, since force size and readiness are
affected, while recruiting and training costs are not fully recovered. Army leaders and
manpower planners are concerned about the high attrition rates, and numerous studies
have been done to find explanations (Buddin, 1984; Toomepuu, 1986; Horne, 1986).

One area of research has concentrated on determining whether the characteristics of a
recruit, at time of enlistment, are significant factors in determining whether the person is
a high or a low risk for attrition. Studies have shown that those individuals categorized
by the U.S. Army as high-quality recruits (high school graduates who scored 50% or higher
on the Armed Forces Qualification Test (AFQT)¹) have attrition rates about half
those of the other recruits in that cohort (Fernandez, 1985; Armor, 1982; Sinaiko,
1981). On average, 55% of each accession cohort is classified as high quality.

Extensive studies have also shown that male recruits have markedly different attrition
patterns and lower attrition rates than their female counterparts (Quester, 1986;
Grissmer, 1985).

1.2  APPROACH 


This study differs from previous ones in that it examines only the non-prior service
high-quality male population of an accession cohort. It attempts to determine, within
this "desirable" recruiting population, whether any characteristics of a recruit can be
quantifiable factors in categorizing the person as a high or low risk for attrition.

The reasons for analysis of this particular group are twofold. First-term attrition
among high-quality recruits is especially costly since these recruits have higher
associated recruiting and training costs than those of other recruits. The male
population was selected since it generally makes up between 85% and 90% of the total
cohort population.

This study used a logistic regression model and maximum likelihood estimation for
estimating parameter values. First, a large sample population was randomly chosen
from the entire accession cohort population. As a baseline, a Chi-square test was
conducted to rule out the possibility that variation in the attrition rates among the
different recruiting groups was caused merely by random noise. A multivariate
logistic regression model was then developed to attempt to quantify the effects of the
variables considered on attrition levels. Maximum likelihood estimation was used to
obtain the best estimates of the model parameters. Model validation was accomplished
with a 5% hold-out sample. Once the model was validated, it was used to estimate
the probability of attrition for individuals with specific characteristics. The results can
be used by Army leaders and manpower planners to determine criteria for future
enlistment and to minimize attrition.

¹ The score is a composite of a subset of the individual Armed Services Vocational
Aptitude Battery (ASVAB) component scores, reflecting language and arithmetic
skills, and is used as a measure of general aptitude, trainability and productivity.


1.3  ORGANIZATION  OF  THESIS 

The next three sections are organized as follows. Section II describes the analysis
methods and data base used in the study. Section III presents the attrition patterns
found for high-quality male recruits of the FY84 cohort. Section IV reports the
conclusions.




Chapter  2 


ANALYSIS  METHODS  AND  DATA  BASE 


Attrition was modeled as a function of certain individual traits. A Chi-square test was
conducted to assure that the variation of attrition rates exhibited by the different
groups was not attributable solely to randomness, and that an attempt to explain the
phenomenon via a mathematical model was worthwhile and appropriate.

A multivariate logistic regression model was developed to assess the importance of
the variables hypothesized to influence attrition. Maximum likelihood estimation was
used to estimate the parameter values of the model, which allowed the model to be used
to estimate the probability of attrition for an individual with specific characteristics.

2.1  SELECTION  OF  VARIABLES 


Four characteristic traits were chosen for use in the model. These particular traits
(age, education level, aptitude test score category, and entry status) all have one
common and important attribute: the Army already has recognized the importance of
their effects on force readiness and has set entrance criteria and limitations for each
trait. Thus, if the results of this study show that these traits also affect the attrition
rate, then existing regulations can be changed or the focus of recruiting can be directed
towards a particular segment of the population, within the set guidelines and
directives. Variables deemed discriminatory, such as "race" and "number of
dependents", or those deemed not changeable by policy, such as "home state", were
not studied, since a finding about them would be of interest but of minimal practical value.


Table  1  provides  the  definitions  of  the  variables  used  in  the  multivariate  analysis. 


TABLE 1

ASSIGNMENT AND DEFINITIONS OF VARIABLES

Variable    Definition

Age
X11         Age group 17 - 21
X12         Age group 22 - 25
X13         Age group 26 - 29
X14         Age group >= 30

Education Level
X21         High School graduate
X22         High School graduate with <= 2 yrs college
X23         High School graduate with > 2 yrs college

Aptitude Category
X31         Cat IIIA, scored between 50% and 64%
X32         Cat II, scored between 65% and 92%
X33         Cat I, scored above 92%

Entry Status
X41         Entry without waiver
X42         Entry with waiver (medical, legal ...)
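
To illustrate how the definitions in Table 1 might be applied, the following Python sketch maps one recruit's characteristics to the indicator variables X11 through X42. The function name, the argument names, and the reading of X21 as "no college" are assumptions made for this example only; the thesis does not describe its data-preparation code.

```python
def encode_recruit(age, years_college, afqt_percentile, has_waiver):
    """Return 0/1 indicators keyed by the Table 1 variable names.

    Hypothetical helper: argument names and cut-off handling are
    illustrative assumptions, not taken from the thesis.
    """
    names = ("X11", "X12", "X13", "X14",
             "X21", "X22", "X23",
             "X31", "X32", "X33",
             "X41", "X42")
    x = {name: 0 for name in names}

    # Age group: 17-21, 22-25, 26-29, 30 and over.
    if age <= 21:
        x["X11"] = 1
    elif age <= 25:
        x["X12"] = 1
    elif age <= 29:
        x["X13"] = 1
    else:
        x["X14"] = 1

    # Education level: no college, up to 2 years, more than 2 years.
    if years_college == 0:
        x["X21"] = 1
    elif years_college <= 2:
        x["X22"] = 1
    else:
        x["X23"] = 1

    # Aptitude category (assumes a high-quality recruit, score >= 50).
    if afqt_percentile > 92:
        x["X33"] = 1
    elif afqt_percentile >= 65:
        x["X32"] = 1
    else:
        x["X31"] = 1

    # Entry status: with or without a waiver.
    x["X42" if has_waiver else "X41"] = 1
    return x


recruit = encode_recruit(age=19, years_college=1,
                         afqt_percentile=70, has_waiver=False)
```

Exactly one indicator is set per trait, so the four traits define 4 x 3 x 3 x 2 = 72 possible combinations (cells).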



2.2  CHI-SQUARE  TEST 


Recruits are grouped into cells by age, education level, aptitude test score
category, and entry status. Suppose there are $k$ cells, and one hypothesizes that the
probability of attrition for each cell $i$, $p_i$, where $i = 1, 2, \ldots, k$, is equal to $p$,
where $p$ is the probability of attrition of the entire population. Under this hypothesis,
the differences displayed among the observed individual cell attrition rates would be
attributed to randomness. The major question is: does $p_i = p$?

The Chi-square test is used on the data to test this hypothesis.

$$H_0: p_i = p \quad \text{for all } i$$

$$H_1: p_i \neq p \quad \text{for at least one } i \qquad (2.1)$$

where: $i = 1, \ldots, k$ (number of cells)

$p \neq 0$

One can obtain a good estimate $\hat{p}$ of $p$, and let $D(\hat{p}, p_i)$ be a measure of the
distance between $\hat{p}$ and $p_i$. Reject the hypothesis if $D$ is too large; otherwise accept.

Under the observed probability distribution $(p_1, p_2, \ldots, p_k)$, for large $n$, the
statistic $D$ (2.2) has approximately a $\chi^2_{k-1}$ distribution, and the test based on
$D$ is called the Chi-square test (Breiman, 1973).




$D$ is compared with a critical value computed from a $\chi^2_{k-1}$ distribution. If $D$ is
too large, i.e. greater than the value from the Chi-square table for $\chi^2_{k-1}$ at the
chosen significance level, then $H_0$ is rejected.

Chi-square tables normally go up to 30 degrees of freedom ($k - 1 = 30$). For $k - 1 > 30$,
the acceptance region for $D$ is calculated from the normal approximation to the
$\chi^2_{k-1}$ distribution, which has mean $k - 1$ and variance $2(k - 1)$:

$$D \le (k - 1) + z \sqrt{2(k - 1)} \qquad (2.3)$$

where $z$ is computed from an N(0,1) table (Breiman, 1973).

The  result  was  used  as  a  baseline  to  determine  whether  a  logistic  regression  was 
likely  to  show  statistically  significant  predictors  of  the  attrition  rate. 
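
The test procedure can be sketched in Python as follows. This is a minimal illustration, not the thesis's computation: it assumes the usual Pearson statistic for homogeneity of binomial proportions across cells (which has k - 1 degrees of freedom under the null hypothesis), uses the normal approximation for the critical value, and runs on made-up cell counts. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def chi_square_homogeneity(counts, attritions):
    """Pearson statistic for H0: all cells share one attrition probability.

    counts[i]     = number of recruits in cell i
    attritions[i] = number of those recruits who separated early
    Returns (D, degrees_of_freedom).
    """
    p_hat = sum(attritions) / sum(counts)   # pooled estimate of p
    d = 0.0
    for n_i, a_i in zip(counts, attritions):
        p_i = a_i / n_i                     # observed cell attrition rate
        d += n_i * (p_i - p_hat) ** 2 / (p_hat * (1.0 - p_hat))
    return d, len(counts) - 1


def normal_approx_critical_value(df, alpha=0.05):
    """Upper critical value (k - 1) + z * sqrt(2 (k - 1)) for large df."""
    z = NormalDist().inv_cdf(1.0 - alpha)
    return df + z * math.sqrt(2.0 * df)


# Made-up illustrative data: four cells with different observed rates.
counts = [1000, 800, 600, 400]
attritions = [220, 176, 160, 140]
d, df = chi_square_homogeneity(counts, attritions)

# The approximation is meant for more than 30 degrees of freedom,
# for example df = 71 when there are 72 cells.
critical = normal_approx_critical_value(71)
reject_h0 = d > normal_approx_critical_value(df)  # illustration only
```

For small degrees of freedom, as in the four-cell toy data, a printed Chi-square table would normally be used instead of the approximation.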

2.3  LOGISTIC  REGRESSION  (  LOGIT ) 

A recruit either separates prior to, or remains in service until, completion of the
obligated tour. Let us consider attrition, $Y_i$, as a binary (two-valued) dependent
variable, and attempt to explain it from the independent variables, the $X_{ij}$'s
(individual characteristic traits).

The dependent or response variable, $Y_i$, has a Bernoulli distribution, and is defined
as:

$$Y_i = \begin{cases} 1, & \text{if individual } i \text{ did not complete his obligated tour} \\ 0, & \text{if individual } i \text{ did complete his obligated tour} \end{cases}$$

and

$$E(Y_i) = \text{prob}(Y_i = 1 \mid X_{ij})$$

$$1 - E(Y_i) = \text{prob}(Y_i = 0 \mid X_{ij}) \qquad (2.4)$$




The binary observations were available on $n$ recruits, assumed to be independent.
The problem was to use the data to develop a good method of analysis for assessing
the dependence of the probability of attrition on $X_{ij}$, the characteristic traits. The
usual linear regression models would be unsatisfactory for two primary reasons:

1. Ordinary least squares linear regression models require that the
variance of $Y_i$ be constant, whereas in this case var($Y_i$) is a function of the
expected $Y_i$: for a Bernoulli variable, var($Y_i$) = $E(Y_i)(1 - E(Y_i))$.

2. Linear regression models do not have the constraint $0 < p < 1$ that
this case requires. Linear regression models could thus produce absurd results,
such as estimated probabilities below 0 or above 1.

Regression of a logistic response function is more appropriate, since it allows for the
variance to depend on the mean $p$, and it does not allow the estimated probabilities to
fall outside the 0 - 1 range. Figure 1 illustrates the S-shaped curve of the logistic
response function, which asymptotically approaches zero at one end, and one at the
other end.


Figure 1: Logistic Response Function [vertical axis: E[Y]]


The model is represented as:

$$E(Y_i) = \text{prob}(Y_i = 1 \mid X_{ij}) = \frac{e^{\beta_0 + \sum_j \beta_j X_{ij}}}{1 + e^{\beta_0 + \sum_j \beta_j X_{ij}}} \qquad (2.5)$$

where:

$X_{ij}$ = vector of characteristics of a recruit
prob($Y_i = 1 \mid X_{ij}$) = probability of attrition of a recruit $i$, with characteristic traits $X_{ij}$
$\beta$'s = the parameters of the model.

The observed values of $Y_i$ can be used to fit parameters to this curve and thus give
estimates $b_0, \ldots, b_j$ of $\beta_0, \ldots, \beta_j$, and $\hat{p}$ for prob($Y_i = 1 \mid X_{ij}$).
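
As a small illustration of the logistic response function in (2.5), the Python sketch below evaluates the probability for a given parameter vector and indicator vector. The function name and the coefficient values are hypothetical and chosen only to show the behaviour of the curve; they are not estimates from this study.

```python
import math


def attrition_probability(beta0, betas, x):
    """Logistic response e^eta / (1 + e^eta), eta = beta0 + sum_j beta_j * x_j."""
    eta = beta0 + sum(b * xj for b, xj in zip(betas, x))
    # Two algebraically equal forms are used to avoid overflow in exp().
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)


# Hypothetical coefficients for two indicator variables (illustration only).
beta0, betas = -1.3, [0.4, -0.3]
p_neither = attrition_probability(beta0, betas, [0, 0])
p_first = attrition_probability(beta0, betas, [1, 0])
```

Whatever values the parameters take, the result stays strictly between 0 and 1 for moderate values of the linear predictor, which is the property that motivates the logistic form over a linear one.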


One can take a log transformation, called the logit, and linearize the model:

$$\text{Logit} = \log\left(\frac{\text{prob}(Y_i = 1 \mid X_{ij})}{1 - \text{prob}(Y_i = 1 \mid X_{ij})}\right) = b_0 + \sum_j b_j X_{ij} \qquad (2.6)$$

Thus, prob($Y_i = 1 \mid X_{ij}$) is not assumed to be linear in $x$; instead, it is the logit
transformation that is assumed linear in $x$. However, this technique generally gives
satisfactory estimates only for grouped data, where the number in each cell is required
to be large, and only when each prob($Y_i = 1 \mid X_{ij}$) is sufficiently far from 0 and 1 so
that the observed number in the cell is approximately normal. In this situation there
existed only 8 of 72 cells with a small population and/or an observed
prob($Y_i = 1 \mid X_{ij}$) at 0 or 1.

Nevertheless, I preferred not to transform the data, but to solve the nonlinear
equations directly, using maximum likelihood estimation (Morris, 1981).
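
The step from the logistic response function (2.5) to the linear logit (2.6) can be written out as a short derivation. The shorthand symbols $p_i$ and $\eta_i$ are introduced here only for this derivation and are not notation from the thesis; the derivation is stated in terms of the model parameters, with the estimates $b_j$ taking their place in (2.6).

```latex
\begin{align*}
p_i &= \text{prob}(Y_i = 1 \mid X_{ij}) = \frac{e^{\eta_i}}{1 + e^{\eta_i}},
      \qquad \eta_i = \beta_0 + \sum_j \beta_j X_{ij} \\
1 - p_i &= \frac{1}{1 + e^{\eta_i}} \\
\frac{p_i}{1 - p_i} &= e^{\eta_i} \\
\log\frac{p_i}{1 - p_i} &= \eta_i = \beta_0 + \sum_j \beta_j X_{ij}
\end{align*}
```

The logit is undefined when $p_i$ equals 0 or 1, which is one reason cells with observed rates at those extremes are troublesome for the transformed approach.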

2.4  MAXIMUM  LIKELIHOOD  ESTIMATION  (  MLE  ) 

The major advantage of the maximum likelihood method resides in the asymptotic
properties of the estimators. Under broad conditions (Hanushek, 1977), the
maximum likelihood estimators are:

(a) consistent
(b) asymptotically efficient, and
(c) asymptotically normal

The principle of maximum likelihood involves choosing $b$'s as estimates of the $\beta$'s, so
that if they were actually the $\beta$'s, the given observations would have the highest
probability of occurrence.

The maximum likelihood estimation was accomplished with the statistical software
package SYSTAT. The procedure used the following log-likelihood function:

$$L(b_n) = \sum \left[ (\text{failure}) \log(\text{estimate}) + (1 - \text{failure}) \log(1 - \text{estimate}) \right] \qquad (2.7)$$

where:

$b_n$ = the parameters to be estimated
failure = 0 if individual is counted as attrition,
        = 1 if individual remained in service
estimate = $\dfrac{e^{\beta_0 + \sum_j \beta_j X_{ij}}}{1 + e^{\beta_0 + \sum_j \beta_j X_{ij}}}$
$\sum$ = sum over all $k$ cells

SYSTAT's MLE is accomplished by minimizing the negative of the log-likelihood
function. The estimation is done as follows. First, a model and a loss function are
specified. In this case, the model is a logit model and the loss function is expressed as
the negative of the log-likelihood function. Starting values for the parameters can be
entered or a set of default values can be used. The model is evaluated using the
starting values; the result is called the estimate. The loss statement, in turn, is
evaluated using the estimate values. The procedure is repeated for all the cells in the
file and the loss is summed over all cells. The summed loss is then minimized via the
Quasi-Newton algorithm. This minimization algorithm uses numerical estimates of the
first and second derivatives of the loss function to seek a minimum. Iterations continue
until the tolerance criterion for convergence (tolerance = .00001) is reached.
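
The idea of minimizing a summed negative log-likelihood over cells can be sketched in a few lines of Python. This is not a reproduction of SYSTAT's procedure: the sketch swaps in a plain Newton-Raphson iteration with analytic derivatives in place of the Quasi-Newton algorithm with numerical derivatives, handles only an intercept and one indicator variable, codes the outcome as 1 for attrition, applies the tolerance to the size of the parameter step (an assumption), and runs on made-up cell counts.

```python
import math


def sigmoid(eta):
    return 1.0 / (1.0 + math.exp(-eta))


def neg_log_likelihood(b0, b1, cells):
    """Loss summed over cells; each cell is (x, n_recruits, n_attritions)."""
    loss = 0.0
    for x, n, a in cells:
        p = sigmoid(b0 + b1 * x)            # model estimate for the cell
        loss -= a * math.log(p) + (n - a) * math.log(1.0 - p)
    return loss


def fit_logit(cells, tol=1e-5, max_iter=50):
    """Newton-Raphson fit of logit(p) = b0 + b1 * x on grouped data."""
    b0, b1 = 0.0, 0.0                       # default starting values
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, n, a in cells:
            p = sigmoid(b0 + b1 * x)
            r = n * p - a                   # gradient term of the loss
            w = n * p * (1.0 - p)           # Hessian weight of the loss
            g0 += r
            g1 += r * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det  # solve the 2x2 system H * step = g
        step1 = (h00 * g1 - h01 * g0) / det
        b0 -= step0
        b1 -= step1
        if max(abs(step0), abs(step1)) < tol:
            break
    return b0, b1


# Made-up data: 200 of 1000 attritions without the trait, 150 of 500 with it.
cells = [(0, 1000, 200), (1, 500, 150)]
b0, b1 = fit_logit(cells)
```

With one parameter per distinct cell, as in this toy case, the fitted probabilities reproduce the observed cell rates; with fewer parameters than cells, as in a 72-cell model, the fit balances the cells against each other.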

2.5 DATA BASE

The personnel data used in this study were maintained and provided by the
Defense Manpower Data Center (DMDC). The DMDC cohort file for each fiscal year
contains the personal, educational, geographical, and professional data, from entry
until separation, of all individuals who were identified as gains during that fiscal year.
The FY84 cohort file was used because it was the most recent cohort for which data
had been collected for a minimum of 4 years. Recruits can enlist for up to 4 years of
service, and we wanted to consider the actual attrition status of those personnel studied.




The entire FY84 cohort NPS high-quality male recruit population was 51,140. The
model was calibrated on data for 95% (48,583) of the population. The
other 5% (2,557) were kept as a "hold-out" sample for model validation.
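
A random 95% / 5% split of this kind could be drawn as in the Python sketch below. The function name, the fixed seed, and the use of integer record identifiers are assumptions for illustration; the thesis does not say how its split was implemented.

```python
import random


def split_holdout(record_ids, holdout_fraction=0.05, seed=1984):
    """Randomly split records into (calibration, hold-out) lists."""
    rng = random.Random(seed)               # fixed seed for repeatability
    ids = list(record_ids)
    rng.shuffle(ids)
    n_holdout = round(len(ids) * holdout_fraction)
    return ids[n_holdout:], ids[:n_holdout]


# 51,140 placeholder identifiers standing in for the recruit records.
calibration, holdout = split_holdout(range(51_140))
```

With 51,140 records this yields 48,583 calibration records and 2,557 hold-out records, matching the sizes reported above.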




Chapter  3 


ATTRITION  ANALYSIS 

3.1 DEMOGRAPHIC PROFILE - GENERAL

The FY84 accession cohort was composed of 131,933 persons, of whom 114,681
(86.9%) were males and 17,252 (13.1%) were females. There were 61,994 (47.0%)
high-quality recruits. Male high-quality recruits numbered 51,140 (44.6% of cohort
males) and female high-quality recruits numbered 10,854 (62.9% of cohort females).
Within the high-quality population, 82.5% were males.
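
The percentages quoted in this paragraph follow directly from the counts, as the short Python check below shows. The variable and function names are chosen for this example only.

```python
cohort = 131_933
males, females = 114_681, 17_252
hq_males, hq_females = 51_140, 10_854
hq_total = hq_males + hq_females            # 61,994 high-quality recruits


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


shares = {
    "males in cohort": pct(males, cohort),
    "females in cohort": pct(females, cohort),
    "high quality in cohort": pct(hq_total, cohort),
    "high quality among males": pct(hq_males, males),
    "high quality among females": pct(hq_females, females),
    "males among high quality": pct(hq_males, hq_total),
}
```

The computed shares are 86.9, 13.1, 47.0, 44.6, 62.9 and 82.5 percent, in the order listed.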


Figure 3.1: FY84 NPS Accession Cohort - Sex [chart labels: 86.9% Male; 13.1%]

[Figure with caption truncated in the source: "...ession Cohort ... Low Quality"; chart label: 44.6% High]

Figure 3.4: FY84 NPS Accession Cohort - Female - High vs Low Quality [chart label: 37.1% Low Quality]

Figure 3.5: FY84 NPS Accession Cohort - High Quality - Male vs Female [chart label: 82.5% Male]


3.2  DEMOGRAPHIC  PROFILE  -  HIGH-QUALITY  MALES 


There  were  11,354  high-quality  male  recruits  who  separated  from  the  Army  prior  to 
completion  of  their  obligated  tours.  The  attrition  rate  for  this  group  was  22.2%,  as 
compared  to  the  30.2%  for  the  overall  accession  cohort. 


Figure 3.6: FY84 NPS Accession Cohort - Attrition Rates - High Quality Male vs Total
Population [categories: Total Population (131,933), High Quality Male]


3.3  DEMOGRAPHIC  PROFILE  -  HIGH  QUALITY  MALES  -  SAMPLE  GROUP 

3.3.1  AGE 

Recruits' ages ranged from 17 to 35, the allowable limits for enlistment. The two ages
with the highest enlistment numbers were age 18, with 17,347 (35.7%), and age 19,
with 8,870 (18.3%). The six highest attrition rates belonged to ages 30 - 35. The
profiles are shown in Figures 3.7 and 3.8. The recruits were then separated into four
major age groups: 17 - 21 (prime target for recruiting), 22 - 25, 26 - 29, and 30+.
Their attrition rates are 21.8%, 21.9%, 26.8%, and 36.1%, respectively. Figure 3.9
displays the age group populations, and the associated attrition proportions are shown
in Figure 3.10. Generally, the attrition rate increased as age increased. A recruit in an
older age group had a higher rate of attrition than a recruit, with similar characteristic
traits, in a younger age group.


Figure 3.7: FY84 NPS Accession Cohort - Sample Group - by Age -
Total Numbers vs Numbers of Attrition [horizontal axis: Age]

Figure 3.8: FY84 NPS Sample Group - by Age -
Proportion of Attrition

Figure 3.9: FY84 NPS Accession Cohort - Sample Group - by Age Groups -
Total Numbers vs Numbers of Attrition [vertical axis: Population; horizontal axis:
Age Group (17 - 21, 22 - 25, 26 - 29, >= 30); legend: Total Number of Recruits,
Number of Attritions]

Figure 3.10: FY84 NPS Accession Cohort - Sample Group - by Age Groups -
Proportion of Attrition [horizontal axis: Age Group (17 - 21, 22 - 25, 26 - 29, >= 30)]


3.3.2  EDUCATION  LEVEL 


High-quality recruits' education levels¹ ranged from high school graduate to PhD.
The two education levels with the highest enlistment numbers were high school
graduate, with 40,505 recruits (83.4%), and high school graduate with one year of
college education, with 2,962 (6.1%). The highest attrition rate, 23.3%, belonged to
the high school graduate level. The data are displayed in Figures 3.11 and 3.12. The
recruits were combined into three major education level groups: high school graduate,
high school graduate with two or fewer years of college education, and high school
graduate with more than two years of college education. Their attrition rates were
23.3%, 17.7%, and 15.6%, respectively. The education level groups and their attrition
rates are displayed in Figures 3.13 and 3.14. Generally, the attrition rate decreased
as the education level increased: a recruit in a lower education level group had a
higher risk of attrition than a recruit with similar characteristics in a higher
education level group.

One must be careful with this result, since age and education level are correlated. As
one can reasonably expect, an older person is more likely to have a higher education
level, and vice versa. But an older recruit is less likely to remain for the full service
tour. Therefore, the two effects may partly cancel each other: a move in one direction
for age is associated with a move in the opposite direction for education level.


¹ 6 = High School Graduate (HSG); 7 = HSG + 1 yr college; 8 = HSG + 2 yrs college;
9 = HSG + 3 or 4 yrs college (no degree); 10 = College Grad; 11 = Masters;
12 = PhD.




Figure 3.11 : FY84 NPS Accession Cohort - Sample Group - by Education Level -
Total  Numbers  vs  Numbers  of  Attrition 




Figure  3.12  :  FY84  NPS  Accession  Cohort  -  Sample  Group  -  by  Education  Level  - 
Proportion  of  Attrition 


[Bar chart. Y-axis: Population (0 to 50000); x-axis: Education Level Group (HSG, HSG + ≤ 2, HSG + > 2).]

Figure 3.13 : FY84 NPS Accession Cohort - Sample Group - by Education Level Group -
Total Numbers vs Numbers of Attrition




Figure  3.14  :  FY84  NPS  Accession  Cohort  -  Sample  Group  -  by  Education  Level  Group  - 
Proportion  of  Attrition 




3.3.3  APTITUDE  CATEGORY 


High-quality recruits' aptitude levels ranged from Category IIIA to Category I.¹ The
two aptitude levels with the highest enlistment numbers were Category II, with 26,028
recruits (53.6%), and Category IIIA, with 17,140 (35.3%). The highest attrition rate,
24.5%, belonged to Category IIIA. The attrition rates for the other two categories
were 21.8% for Category II and 17.6% for Category I. The attrition rate decreased as
the aptitude level increased.


[Bar chart. Y-axis: Population; x-axis: Aptitude Category.]


Figure 3.15 : FY84 NPS Accession Cohort - Sample Group - by Aptitude Category -
Total Numbers vs Numbers of Attrition


¹ Cat IIIA : AFQT Score = 50%-64% ; Cat II : AFQT Score = 65%-92% ;
Cat I : AFQT Score > 92%


[Bar chart. Y-axis: Proportion of Attrition; x-axis: Aptitude Category.]

Figure 3.16 : FY84 NPS Accession Cohort - Sample Group - by Aptitude Category -
Proportion  of  Attrition 

3.3.4  WAIVER  STATUS 

Recruits came into the Army either with or without a waiver. A person meeting all
entry requirements enters without a waiver. A person who wishes to enter the Army
but does not meet the normal entry standards may request a waiver for enlistment.

For example, a person outside the age limits of 17 to 35 may request a waiver for
age. Another may request a waiver for physical qualification if he or she is not deemed
physically fit by entry standards. There are numerous waiver categories, as listed in
Appendix A. Only a small group entered with waivers: 5,500 recruits (11.3%), with an
attrition rate of 27.5%. The attrition rate for those who entered the Army without a
waiver was 21.6%.


[Bar chart. Y-axis: Population (0 to 50000); x-axis: Entry Status (Without Waiver, With Waiver); series: Total Number of Recruits, Numbers of Attrition.]

Figure  3.17  :  FY84  NPS  Accession  Cohort  -  Sample  Group  -  by  Entry  Status 
Total  Numbers  vs  Numbers  of  Attrition 


[Bar chart. Y-axis: Proportion of Attrition; x-axis: Entry Status (Without Waiver, With Waiver).]

Figure 3.18 : FY84 NPS Accession Cohort - Sample Group - by Entry Status
Proportion of Attrition




3.4  MODEL  ANALYSIS 


3.4.1  CHI-SQUARE  TEST  RESULTS 

The Chi-square test strongly rejected the hypothesis that the attrition rates in each of
the separate cells were the same. Each cell is a grouping of individuals belonging to
the same age group, education level group, aptitude test score category, and entry
status. There were 4 age groups (X11, X12, X13, X14), 3 education level groups
(X21, X22, X23), 3 aptitude test score category groups (X31, X32, X33), and
2 entry status groups (X41, X42). As a result, we had 4 x 3 x 3 x 2 = 72 separate
cells. An example of a cell is X11X21X31X41, which consists of individuals in age
group 17-21 who are high school graduates, fall in aptitude category IIIA, and required
no waiver for entry. The test results suggested that the variation seen among the
attrition rates around the overall average was not due to random noise. The
Chi-square value for our data was 509.4. The 95th percentile of the Chi-square
distribution with 70 degrees of freedom is 88.4.

The Chi-square test is generally very harsh with large sample sizes, so one may wonder
whether the test is simply being hard on the data or whether there is a major problem
with the hypothesis. In this case, there was little room for doubt: the discrepancy was
huge. The relevant χ² distribution has a mean of 70, a variance of 140, and a standard
deviation of 11.83. Thus, the test Chi-square value was over 35 standard deviations
above the mean. The test results encouraged further efforts to seek a prediction rule
that yields different forecasts for different groups of recruits.
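
The arithmetic behind the "standard deviations above the mean" comparison can be checked with a short sketch. It assumes only the standard facts that a Chi-square distribution with d degrees of freedom has mean d and variance 2d; the function name is illustrative and not part of the original analysis.

```python
import math


def chi_square_excess(statistic: float, dof: int) -> float:
    """Return how many standard deviations a Chi-square statistic lies above its mean.

    A Chi-square distribution with `dof` degrees of freedom has
    mean `dof` and variance `2 * dof`.
    """
    mean = dof
    std = math.sqrt(2 * dof)
    return (statistic - mean) / std


# Values quoted in the text: statistic 509.4 with 70 degrees of freedom.
excess = chi_square_excess(509.4, 70)
print(f"{excess:.1f} standard deviations above the mean")
```

The result is consistent with the statement that the statistic lies more than 35 standard deviations above the mean.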




3.4.2  LOGIT  AND  MLE  RESULTS 


The model with age, education, aptitude score category and entry status was
formulated with the logistic function. MLE was used to obtain the parameter values.
The regression model was entered into SYSTAT as follows:

    model failure = exp(b0 + b1*Age + b2*Ed Level + b3*Aptitude Cat + b4*Entry Status) /
                    (1 + exp(b0 + b1*Age + b2*Ed Level + b3*Aptitude Cat + b4*Entry Status))    (3.1)

    loss = -count*(failure*log(estimate) + (1-failure)*log(1-estimate))    (3.2)


where:

failure = 1 if counted as attrition
        = 0 if not counted as attrition
bn = parameter values for n = 0, 1, 2, 3, 4
Age = 1 if X11, 2 if X12, 3 if X13, 4 if X14
Ed Level = 1 if X21, 2 if X22, 3 if X23
Aptitude Cat = 1 if X31, 2 if X32, 3 if X33
Entry Status = 1 if X41, 2 if X42
loss = MLE loss function
count = number of individuals within that cell, associated with failure = 1 or = 0
estimate = value of model failure computed in the prior iteration

Results of the regression are shown in Table 2.
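
As an illustration of how the loss in (3.2) behaves, the following sketch evaluates it for a single hypothetical cell. The cell counts (100 recruits, 25 attritions) are invented for the example, and the sketch assumes failure = 1 marks a recruit who attrited; it is not the SYSTAT procedure itself.

```python
import math


def loss_term(count: int, failure: int, estimate: float) -> float:
    """One term of the loss in equation (3.2): a weighted negative log-likelihood."""
    return -count * (failure * math.log(estimate)
                     + (1 - failure) * math.log(1 - estimate))


def cell_loss(n_attrition: int, n_retained: int, estimate: float) -> float:
    """Loss for one cell: one record with failure = 1 and one with failure = 0."""
    return (loss_term(n_attrition, 1, estimate)
            + loss_term(n_retained, 0, estimate))


# Hypothetical cell (assumption): 25 attritions among 100 recruits.
loss_at_observed_rate = cell_loss(25, 75, 0.25)
loss_at_other_rate = cell_loss(25, 75, 0.40)
print(loss_at_observed_rate, loss_at_other_rate)
```

For a single cell the loss is smallest when the estimate equals the observed proportion, which is why minimizing the total loss over all cells pulls the fitted rates toward the observed ones.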


TABLE 2

PARAMETER VALUES FROM REGRESSION

Parameter     Estimate      Standard Error
b0           -1.190571         0.075746
b1            0.310982         0.013333
b2           -0.414799         0.036670
b3           -0.156133         0.039601
b4            0.272056         0.037916

The parameter values implied that age and entry status were positively associated
with the dependent variable: a higher age and/or entry with a waiver would result in a
higher attrition rate. Conversely, education level and aptitude category were
negatively associated with the dependent variable. Thus, the higher one's education
level and/or aptitude category, the lower one's probability of attrition.
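
A minimal sketch of how a predicted attrition rate follows from the Table 2 estimates and the logistic function in (3.1). The function and variable names are illustrative; the group codes follow the coding given with the model (for example, Age = 1 for X11 and Entry Status = 2 for X42).

```python
import math

# Parameter estimates b0..b4 as reported in Table 2.
B = (-1.190571, 0.310982, -0.414799, -0.156133, 0.272056)


def predicted_attrition(age: int, ed_level: int, aptitude_cat: int,
                        entry_status: int) -> float:
    """Evaluate the logistic function of equation (3.1) for one cell."""
    x = (B[0] + B[1] * age + B[2] * ed_level
         + B[3] * aptitude_cat + B[4] * entry_status)
    return math.exp(x) / (1.0 + math.exp(x))


# Cell X11X21X31X41: age 17-21, HS graduate, Cat IIIA, no waiver.
rate_young_hsg = predicted_attrition(1, 1, 1, 1)
# Cell X14X21X31X42: oldest age group, HS graduate, Cat IIIA, with waiver.
rate_highest = predicted_attrition(4, 1, 1, 2)
# Cell X11X23X33X41: age 17-21, more than 2 yrs of college, Cat I, no waiver.
rate_lowest = predicted_attrition(1, 3, 3, 1)

print(round(rate_young_hsg, 3), round(rate_highest, 3), round(rate_lowest, 3))
```

Because b1 and b4 are positive and b2 and b3 are negative, moving to an older age group or to waiver entry raises the computed rate, while a higher education or aptitude code lowers it.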


The parameter values were entered into the model's logistic function and all predicted
attrition rates were calculated. These rates and their corresponding observed rates
are shown in Table 3. There were 45 cells (66.2%) that had predicted values within
one standard deviation of their means, 16 cells (23.5%) that had predicted values
between one and two standard deviations away, and the remaining 7 cells (10.3%)
that had predicted values beyond two standard deviations away. This was encouraging:
with 68 cells, even if the probabilistic prediction were exactly right, only about 68%
of the cells should fall within one standard deviation of the mean and about 95%
within two standard deviations of the mean.
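
The banding described above can be sketched as follows. This is an illustration under a stated assumption: the standard deviation of a cell's observed rate is taken as the binomial value sqrt(p(1 - p)/n), with p the expected rate and n the cell population. The three cells are the first three rows of Table 3; the function names are hypothetical.

```python
import math


def deviation_in_sd(population: int, expected_rate: float,
                    observed_rate: float) -> float:
    """Signed distance between observed and expected rates, in binomial SDs."""
    sd = math.sqrt(expected_rate * (1.0 - expected_rate) / population)
    return (observed_rate - expected_rate) / sd


def band(z: float) -> str:
    """Classify a deviation into the three bands used in the text."""
    size = abs(z)
    if size <= 1.0:
        return "within 1 SD"
    if size <= 2.0:
        return "1 to 2 SD"
    return "beyond 2 SD"


# (population, expected rate, observed rate) for the first three rows of Table 3.
cells = [(12843, 0.235, 0.233), (1350, 0.288, 0.316), (16810, 0.208, 0.212)]
bands = [band(deviation_in_sd(*cell)) for cell in cells]
print(bands)
```

Large cells have small standard deviations, so even a difference of a few tenths of a percentage point can place a cell outside the one-standard-deviation band.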


TABLE 3

Age = 1 if X11, 2 if X12, 3 if X13, 4 if X14
Education = 1 if X21, 2 if X22, 3 if X23
Aptitude Category = 1 if X31, 2 if X32, 3 if X33
Entry Status = 1 if X41, 2 if X42

Age  Education  Aptitude   Entry   Population   Expected  Observed
                Category   Status                Attrition Rate

 1       1         1        1       12843       .235      .233
 1       1         1        2        1350       .288      .316
 1       1         2        1       16810       .208      .212
 1       1         2        2        1740       .257      .272
 1       1         3        1        2182       .184      .182
 1       1         3        2         240       .228      .167
 1       2         1        1         428       .169      .143
 1       2         1        2          42       .211      .167
 1       2         2        1        1413       .148      .148
 1       2         2        2         195       .186      .169
 1       2         3        1         580       .129      .129
 1       2         3        2          65       .163      .138
 1       3         1        1          14       .118      .071
 1       3         1        2           1       .150      .000
 1       3         2        1          68       .103      .162
 1       3         2        2           5       .131      .000
 1       3         3        1          47       .089      .064
 1       3         3        2           2       .114      .000
 2       1         1        1        1047       .296      .289
 2       1         1        2         276       .355      .304
 2       1         2        1        1757       .264      .272
 2       1         2        2         493       .320      .284
 2       1         3        1         347       .235      .196
 2       1         3        2          96       .287      .281
 2       2         1        1         266       .217      .233
 2       2         1        2          46       .267      .283
 2       2         2        1         832       .192      .181
 2       2         2        2         168       .238      .268
 2       2         3        1         349       .169      .158
 2       2         3        2          86       .210      .291
 2       3         1        1         178       .155      .124
 2       3         1        2          20       .194      .150
 2       3         2        1         828       .135      .118
 2       3         2        2          84       .171      .060
 2       3         3        1         650       .118      .117
 2       3         3        2          67       .150      .134
 3       1         1        1         226       .364      .354
 3       1         1        2          63       .429      .476
 3       1         2        1         444       .329      .315
 3       1         2        2         152       .392      .362
 3       1         3        1          97       .296      .309
 3       1         3        2          29       .355      .241
 3       2         1        1          88       .275      .227
 3       2         1        2          12       .332      .417
 3       2         2        1         224       .245      .246
 3       2         2        2          42       .298      .095
 3       2         3        1          85       .217      .188
 3       2         3        2          19       .267      .263
 3       3         1        1          69       .200      .246
 3       3         1        2           8       .247      .250
 3       3         2        1         317       .176      .192
 3       3         2        2          41       .219      .195
 3       3         3        1         245       .155      .184
 3       3         3        2          32       .194      .219
 4       1         1        1          84       .439      .429
 4       1         1        2          16       .507      .500
 4       1         2        1         129       .401      .388
 4       1         2        2          38       .468      .500
 4       1         3        1          43       .364      .419
 4       1         3        2           3       .429      .000
 4       2         1        1          25       .341      .400
 4       2         1        2           6       .404      .333
 4       2         2        1          75       .307      .293
 4       2         2        2          15       .367      .533
 4       2         3        1          36       .274      .278
 4       2         3        2           9       .332      .111
 4       3         1        1          29       .254      .448
 4       3         1        2           3       .309      .557
 4       3         2        1         142       .226      .338
 4       3         2        2          16       .277      .312
 4       3         3        1          86       .200      .256
 4       3         3        2          20       .247      .300


The residual plot is displayed in Figure 3.19. The distribution of the residuals was
fairly random, with a slightly larger spread at the higher cell groups.


Figure 3.19 : Residual Plot of Expected Attrition Rates
vs Observed Attrition Rates


The model fitted the data very well. The explanatory variables were associated
positively or negatively with the dependent variable, as one would have anticipated.
A look at the extreme cells showed that the model behaved excellently there. The
highest risk group, per the model, is cell X14X21X31X42 (age ≥ 30, HS graduate,
Cat IIIA, with waiver). It had a predicted attrition rate of 50.7%, versus an observed
rate of 50.0%. The lowest risk group is cell X11X23X33X41 (age 17-21, HS
graduate with > 2 yrs of college, Cat I, without waiver). It had a predicted attrition
rate of 8.9%, while the observed rate was 6.4%. The differences are not statistically
significant.


3.5  MODEL  VALIDATION 

To validate the original model, the 5% hold-out sample was used. The question is
whether the pattern that prevailed in 95% of the whole population also prevails in
the other 5%, which had nothing to do with model formulation. If the fit is good,
then one's confidence in the model increases: it means that the model did more than
just "coddle" the data of the sample group.

Only 59 of the 72 cells were used with the hold-out sample; the deleted cells had
zero entries. The model also behaved well with this set of data. The highest risk
group among the 59 cells is cell X14X21X32X42 (age ≥ 30, HS graduate, Cat II,
with waiver). It had a predicted attrition rate of 46.8%, versus an observed rate of
60.0%. The lowest risk group is cell X11X23X33X41 (age 17-21, HS
graduate with > 2 yrs of college, Cat I, without waiver). It had a predicted attrition
rate of 8.9%, while the observed rate was 0.0%. The differences again are not
statistically significant. The predicted and observed values are listed in Table 4.

To  further  examine  the  accuracy  of  the  model,  a  goodness-of-fit  test  between  the 
observed  and  expected  frequencies  was  conducted.  First,  we  calculated  where  the 
observed  attrition  rates  fell  as  percentiles  of  their  distributions  given  that  the  binomial 
parameters  within  each  group  were  correct.  We  did  this  by  using  the  technique  of 
normal  approximation  to  the  binomial  distribution  (  Walpole,  1978  )  and  the  z  table. 


The z values were calculated with:

\[ z = \frac{X - np}{\sqrt{npq}} \tag{3.3} \]

where:

X = observed number of attritions for the cell
n = total number of recruits for the cell
p = expected probability of attrition for the cell
q = 1 - p
\sqrt{npq} = \sigma = standard deviation for the cell


To ensure that the approximation is fairly good, we combined some cells so that
n ≥ 4 in every cell. This reduced the number of cells from 59 to 28.
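
The percentile calculation can be sketched as below. It is a hedged illustration, not the author's table lookup: the normal CDF is computed with `math.erf` instead of a z table, and the attrition count X = 152 for the first hold-out cell (n = 649, p = .235) is an assumption back-computed from the rounded observed rate of .234.

```python
import math


def z_value(x: float, n: int, p: float) -> float:
    """Normal approximation to the binomial: z = (X - np) / sqrt(npq)."""
    q = 1.0 - p
    return (x - n * p) / math.sqrt(n * p * q)


def normal_percentile(z: float) -> float:
    """Standard normal cumulative probability, used in place of a z table."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# First hold-out cell: n = 649, expected rate .235; X = 152 is an assumed count.
z = z_value(152, 649, 0.235)
percentile = normal_percentile(z)
# Index of the ten equal-width percentile cells (0 = 0-10%, ..., 9 = 90-100%).
decile = min(int(percentile * 10), 9)
print(round(z, 3), round(percentile, 3), decile)
```

An observed count close to np gives a z value near zero and a percentile near 50%, so a well-behaved cell lands in one of the middle percentile cells.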

We  then  hypothesized  that  the  distribution  of  these  probabilities  is  uniform,  with  2.8 
cells  expected  in  each  of  the  10  equally  spaced  percentile  cells.  The  observed 
frequencies  and  the  expected  frequencies  per  cell  are  listed  in  Table  5.  Figure  3.20 
provides  a  graphical  display  of  the  data. 


TABLE 4

EXPECTED VS OBSERVED RATES - HOLD-OUT SAMPLE

Age = 1 if X11, 2 if X12, 3 if X13, 4 if X14
Education = 1 if X21, 2 if X22, 3 if X23
Aptitude Category = 1 if X31, 2 if X32, 3 if X33
Entry Status = 1 if X41, 2 if X42
" - " = no entries in that cell

Age  Education  Aptitude   Entry   Population   Expected  Observed
                Category   Status                Attrition Rate

 1       1         1        1         649       .235      .234
 1       1         1        2          64       .288      .297
 1       1         2        1         861       .208      .182
 1       1         2        2          94       .257      .277
 1       1         3        1         127       .184      .197
 1       1         3        2           7       .228      .143
 1       2         1        1          24       .169      .169
 1       2         1        2           3       .211      .333
 1       2         2        1          73       .148      .164
 1       2         2        2          10       .186      .100
 1       2         3        1          38       .129      .000
 1       2         3        2           1       .163      .000
 1       3         1        1           3       .118      .000
 1       3         1        2           -         -         -
 1       3         2        1           2       .103      .000
 1       3         2        2           -         -         -
 1       3         3        1           2       .089      .000
 1       3         3        2           -         -         -
 2       1         1        1          50       .296      .240
 2       1         1        2          16       .355      .312
 2       1         2        1          99       .264      .263
 2       1         2        2          29       .320      .379
 2       1         3        1          23       .235      .217
 2       1         3        2           2       .287      .500
 2       2         1        1          12       .217      .083
 2       2         1        2           3       .267      .333
 2       2         2        1          41       .192      .244
 2       2         2        2          12       .238      .000
 2       2         3        1          20       .169      .050
 2       2         3        2           4       .210      .000
 2       3         1        1           8       .155      .000
 2       3         1        2           1       .194      .000
 2       3         2        1          62       .135      .113
 2       3         2        2           8       .171      .250
 2       3         3        1          32       .118      .094
 2       3         3        2           3       .150      .000
 3       1         1        1          14       .364      .357
 3       1         1        2           -         -         -
 3       1         2        1          26       .329      .308
 3       1         2        2           9       .392      .556
 3       1         3        1           3       .296      .000
 3       1         3        2           3       .355      .333
 3       2         1        1           8       .275      .375
 3       2         1        2           1       .332      .000
 3       2         2        1          11       .245      .182
 3       2         2        2           3       .298      .000
 3       2         3        1           6       .217      .000
 3       2         3        2           4       .267      .050
 3       3         1        1           3       .200      .000
 3       3         1        2           1       .247     1.000
 3       3         2        1          20       .176      .150
 3       3         2        2           3       .219      .000
 3       3         3        1          21       .155      .095
 3       3         3        2           -         -         -
 4       1         1        1           7       .439      .429
 4       1         1        2           -         -         -
 4       1         2        1          11       .401      .364
 4       1         2        2           5       .468      .600
 4       1         3        1           -         -         -
 4       1         3        2           -         -         -
 4       2         1        1           1       .341     1.000
 4       2         1        2           -         -         -
 4       2         2        1           2       .307      .000
 4       2         2        2           1       .367     1.000
 4       2         3        1           2       .274      .000
 4       2         3        2           -         -         -
 4       3         1        1           -         -         -
 4       3         1        2           -         -         -
 4       3         2        1           2       .226      .500
 4       3         2        2           1       .277      .000
 4       3         3        1           6       .200      .333
 4       3         3        2           -         -         -


TABLE 5

Percentile    Expected     Observed
Cell          Frequency    Frequency

 0 - 10         2.8           3
10 - 20         2.8           4
20 - 30         2.8           5
30 - 40         2.8           2
40 - 50         2.8           6
50 - 60         2.8           1
60 - 70         2.8           3
70 - 80         2.8           2
80 - 90         2.8           2
90 - 100        2.8           0

[Bar chart. X-axis: Cell (1 to 10); series: Expected Frequency, Observed Frequency.]


Figure  3.20  :  Expected  vs  Observed  Frequencies 




The goodness-of-fit test between the observed and expected frequencies is based on
the quantity:

\[ \chi^2 = \sum_{i=1}^{k} \frac{(o_i - e_i)^2}{e_i} \tag{3.4} \]

where \chi^2 is a value of a random variable whose sampling distribution is
approximated very closely by the chi-square distribution, and o_i and e_i represent
the observed and expected frequencies, respectively, for the i-th cell (Walpole, 1978).

If the observed frequencies are close to the expected frequencies, then the χ² value
is small, denoting a good fit. If the differences are large, then the χ² value is large
and the fit is poor. The χ² value of the model is 10.57, while the critical value of the
χ² distribution with 9 degrees of freedom at the .05 level of significance is 16.92,
attesting to the accuracy of the model. As seen from Figure 3.20, the observed
frequency was generally higher than expected in the middle and low percentile ranges.
This is consistent with the notion that the model has behaved well, since it meant
that more often than not when the model erred, it erred very little.
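
The goodness-of-fit statistic can be recomputed from the frequencies reported in Table 5 with a short sketch. The critical value 16.92 is taken from the text, not computed; the function name is illustrative.

```python
def chi_square_statistic(observed, expected):
    """Sum of (o_i - e_i)^2 / e_i over all cells, as in equation (3.4)."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Observed frequencies in the ten percentile cells (Table 5); they sum to 28.
observed = [3, 4, 5, 2, 6, 1, 3, 2, 2, 0]
# Uniform hypothesis: 28 cells spread evenly over 10 percentile cells.
expected = [2.8] * 10

chi2 = chi_square_statistic(observed, expected)
critical_value = 16.92  # .05 level, as quoted in the text

print(round(chi2, 2), chi2 < critical_value)
```

Since the statistic falls below the critical value, the uniform hypothesis is not rejected.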




Chapter  4 


CONCLUSIONS 


4.1  GENERAL 

This logistic regression model did very well in quantifying the effects of the variables
considered on attrition levels. It not only provided a good approximation with respect
to the sample group, but also behaved well with the 5% hold-out sample, which did not
take an active role in the model calibration. However, it must be pointed out that even
though this model behaved well, it is only one good model among a possibly larger
group of good models.

4.2  FACTORS  CONTRIBUTING  TO  ATTRITION 

The results of the analysis showed that four characteristics are significantly related
to attrition: age, level of education, aptitude test score, and entry status.

Age and entry status displayed positive effects on the rate of attrition, i.e., as age
increased and/or entry status included a waiver, the attrition risk for that recruit
increased. Education and aptitude test score, on the other hand, had negative effects
on the attrition rate. The higher one's level of education and/or aptitude test score, the
higher the probability that one will remain in service until completion of one's obligated
tour. Moreover, a probabilistic model based solely on these four factors fit the data
set down to a small level of discrepancy that can be readily explained as statistical
noise.




4.3  POLICY  IMPLICATIONS  AND  FUTURE  RESEARCH 


4.3.1  POLICY  IMPLICATION 

U.S. Army leaders and manpower planners might find this study useful in
reducing the attrition rate of future NPS high-quality male recruits of an accession
cohort. There are several courses of action (CA):

1.  Increase  enlistment  of  individuals  with  the  traits  of  groupings  which  had 
lower  attrition  rates  than  the  cohort  attrition  rate. 

2.  Decrease  enlistment  of  individuals  with  the  traits  of  groupings  which  had 
higher  attrition  rates  than  the  cohort  attrition  rate. 

3.  Combine  both  of  the  above  courses  of  action. 

CA 1 will increase the number of recruits with low attrition rates, thereby decreasing
the overall attrition rate of the entire cohort. CA 2 will decrease the number of recruits
with high attrition rates; again, the net effect is a lower overall cohort attrition rate.

CA 3 will have similar effects.
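
The effect of these courses of action on the cohort rate is a weighted-average calculation, sketched below. The group rates (21.6% without waiver, 27.5% with waiver) and the 5,500 waiver entrants come from Section 3.3.4; the figure of 43,000 non-waiver entrants and the shift of 1,000 recruits are hypothetical values chosen only for illustration.

```python
def cohort_attrition_rate(groups):
    """Overall rate for a cohort: attrition summed over groups / total recruits.

    `groups` is a list of (number_of_recruits, attrition_rate) pairs.
    """
    total_recruits = sum(n for n, _ in groups)
    total_attrition = sum(n * rate for n, rate in groups)
    return total_attrition / total_recruits


# Baseline mix: 43,000 without waiver (hypothetical count), 5,500 with waiver.
baseline = cohort_attrition_rate([(43000, 0.216), (5500, 0.275)])
# Hypothetical shift: 1,000 fewer waiver entrants, 1,000 more non-waiver entrants.
shifted = cohort_attrition_rate([(44000, 0.216), (4500, 0.275)])

print(round(baseline, 4), round(shifted, 4))
```

The sketch holds each group's attrition rate fixed, which is itself an assumption: changing the recruiting mix could change the rates within groups.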

The  use  of  incentives  and  the  increase  of  emphasis  in  recruiting  a  particular  segment 
of  the  eligible  population  are  legitimate  possibilities.  The  types  and  scope  of  the 
incentives  are  beyond  the  objective  of  this  study. 

4.3.2 FUTURE RESEARCH

This study looked only at the high-quality male population. The logistic regression
model formulation and analysis should be done for the entire population of an NPS
accession cohort. The result, I propose, would be similar, in that age and entry status
would display positive associations with the rate of attrition, while education and
aptitude test score would conversely have negative effects. This might suggest a
recruiting strategy under which the readiness posture of the Army is enhanced.

A study should be conducted to determine the most cost-effective means of increasing
the numbers of the desired recruits identified by this study.




APPENDIX  A 


WAIVER  CATEGORIES 


1.  Age 

2.  Number  of  Dependents 

3.  Mental  Qualification 

4.  Moral  Qualification 

5.  Previous  Disqualification  /  Separation 

6.  Lost  Time 

7.  Physical  Qualification 

8.  Sole  Surviving  Member 

9.  Education 

10.  Alien 

11.  Security  Risk 

12.  Conscientious  Objector 

13.  Pay  Grade 

14.  Skill  Requirements 

15.  Predictor  Requirements 




BIBLIOGRAPHY 




Antel, John, Hosek, James R., and Christine E. Peterson, Military Enlistment and
Attrition, The RAND Corporation Report R-3510-FMP, June 1987.

Ashton,  Winifred  D.,  The  Logit  Transformation  with  Special  Reference  to  Its  Use  in 
Bioassay,  Charles  Griffin  and  Company  Limited,  London,  1972. 

Black, Matthew, and Thomas Fraker, "First-Term Attrition of High School Graduates
in the Military," in Curt Gilroy (ed.), Army Manpower Economics, Westview
Press, Boulder, Colorado, 1986.

Breiman,  Leo,  Statistics:  With  a  View  Toward  Applications,  Houghton  Mifflin, 
Boston,  1973. 

Buddin,  Richard,  Analysis  of  Early  Military  Attrition  Behavior,  The  RAND 
Corporation,  R-3609-MIL,  July  1984. 

Buddin,  Richard,  Trends  in  Attrition  of  High  Quality  Military  Recruits,  The  RAND 
Corporation,  R-3539-FMP,  August  1988. 

Buddin, Richard, and Christina Witsberger, Reducing the Air Force Male Enlistment
Requirement - Effects on Recruiting Projection of the Other Services, The
RAND Corporation, R-3265-AF, March 1985.

Cotterman,  Robert  F.,  Forecasting  Enlistment  Supply,  The  RAND  Corporation, 
R-3252-FMP,  July  1986. 

Cox,  D.R.,  Analysis  of  Binary  Data,  Chapman  and  Hall,  London,  1977. 

Everitt,  B.S.,  and  G.  Dunn,  Advanced  Methods  of  Data  Exploration  and  Modelling, 
Heinemann  Educational  Books,  London,  1983,  pp.  154  -  175. 

Fujikoshi, Yasunori, "Selection of Variables in Discriminant Analysis and Canonical
Correlation Analysis," in P.R. Krishnaiah (ed.), Multivariate Analysis - VI, North
Holland, Amsterdam, Proceedings of the Sixth International Symposium on
Multivariate Analysis, pp. 219-236.

Grissmer,  David  W.,  and  Sheila  Nataraj  Kirby,  Attrition  of  Non-Prior  Service 
Reservists  in  the  Army  National  Guard  and  Army  Reserve,  The  RAND 
Corporation,  R-3267-RA,  October  1985. 

Halperin, M., Blackwelder, W.C., and J.I. Verter, "Estimation of the Multivariate
Logistic Risk Function: A Comparison of the Discriminant Function and
Maximum Likelihood Approaches," Journal of Chronic Diseases, Vol. 24, 1971,
pp. 125 - 158.

Hanushek, Eric A., and John E. Jackson, Statistical Methods for Social Scientists,
Academic Press, New York, 1977, pp. 325 - 344.


Horne, David K., The Impact of Soldier Quality on Performance in the Army, Technical
Report 708, Manpower and Personnel Research Laboratory, Research Institute
for the Behavioral and Social Sciences, April 1986.

Hosek,  James  R.,  Peterson,  Christine  E.,  and  Rick  A.  Eden,  Educational  Expectations 
and  Enlistment  Decisions,  The  RAND  Corporation,  R-3350-FMP,  March 
1986. 

May, Laurie J., and Jacquelyn Hughes, Estimating the Cost of Attrition of First Term
Enlistees in the Marine Corps, Marine Corps Operations Analysis Group
CRM86-168, June 1986.

Morris,  Carl  N.,  and  John  E.  Rolph,  Introduction  to  Data  Analysis  and  Statistical 
Inference,  Prentice-Hall,  Englewood  Cliffs,  New  Jersey,  1981. 

Neter,  John,  Wasserman,  William,  and  Michael  H.  Kutner,  Applied  Linear  Regression 
Models,  Irwin,  Homewood,  Illinois,  1983,  pp.  328  -  376. 

Polich, J. Michael, Dertouzos, James N., and S. James Press, The Enlistment Bonus
Experiment, The RAND Corporation, R-3353-FMP, April 1986.

Quester,  Aline,  and  Martha  S.  Murray,  Attrition  from  Navy  Enlistment  Contracts, 
Naval  Planning  Program  and  Logistic  Division  CRM86-12,  January  1986. 

Stolzenberg, Ross M., and John D. Winkler, Voluntary Terminations from Military
Service, The RAND Corporation, R-3211-MIL, May 1983.

Toomepuu,  Juri,  Costs  and  Benefits  of  Quality  Soldiers,  USAREC  Research  Note 
86-1,  September  1986. 

Toomepuu, Juri, Education and Military Manpower Requirements, Presentation to
Annual Meeting of the Council of Chief State School Officers, U.S. Army
Recruiting Command, November 15, 1986.

Vinod,  Hrishikesh  D.,  and  Aman  Ullah,  Recent  Advances  in  Regression  Methods, 
Marcel  Dekker,  Inc.,  New  York,  1981,  pp.  300-302. 

Walpole,  Ronald  E.,  and  Raymond  H.  Myers,  Probability  and  Statistics  for  Engineers 
and  Scientists,  Macmillan,  New  York,  1978,  pp.  189  -  280. 

Wilkinson,  Leland,  SYGRAPH,  SYSTAT  Inc.,  Evanston,  Illinois,  1988. 

Wilkinson,  Leland,  SYSTAT,  SYSTAT  Inc.,  Evanston,  Illinois,  1988. 




